Storing auxiliary data for efficient maintenance and lineage tracing of complex views
Authors
Abstract
As views in a data warehouse become more complex, the view maintenance process can become very complicated and potentially very inefficient. Storing auxiliary views in the warehouse can reduce the complexity and improve the efficiency of view maintenance, and the same auxiliary views can help in efficiently answering lineage tracing queries over the warehouse views. In this paper, we study the problem of selecting auxiliary views to materialize in order to minimize the total view maintenance and lineage tracing cost. We consider relational views with arbitrary use of aggregation operators, and we define an initial search space for our optimization problem based on a normal form for such view definitions. We present several auxiliary view selection algorithms, and to study their performance we conduct experiments using the TPC-D benchmark in addition to synthetic view definitions and statistics. The results of our experiments show: (1) the exhaustive algorithm that selects the optimal set of auxiliary views is far too expensive in many cases; (2) two heuristic algorithms that we present select good (often optimal) sets of auxiliary views in a much shorter time; (3) even auxiliary views selected by a very simple algorithm can significantly reduce the overall view maintenance and lineage tracing cost.
Similar resources
Lineage Tracing in a Data Warehousing System
A data warehousing system collects data from multiple distributed sources and stores the integrated information as materialized views in a local data warehouse. Users then perform data analysis and mining on the warehouse views. Figure 1 shows the basic architecture of a data warehousing system. In many cases, the warehouse view contents alone are not sufficient for in-depth analysis. It is often...
Lineage Tracing in a Data Warehousing System Demonstration Proposal
A data warehousing system collects data from multiple distributed sources and stores the integrated information as materialized views in a local data warehouse. Users then perform data analysis and mining on the warehouse views. Figure 1 shows the basic architecture of a data warehousing system. In many cases, the warehouse view contents alone are not sufficient for in-depth analysis. It is often usefu...
Practical Lineage Tracing in Data Warehouses
We consider the view data lineage problem in a warehousing environment. For a given data item in a materialized warehouse view, we want to identify the set of source data items that produced the view item. We formalize the problem, and we present a lineage tracing algorithm for relational views with aggregation. Based on our tracing algorithm, we propose a number of schemes for storing auxiliary view...
Incremental Maintenance of Data Warehouses Based on Past Temporal Logic Operators
We see a temporal data warehouse as a set of temporal views defined in the past fragment of the temporal relational algebra extended with set-valued attributes and aggregation. This paper proposes an incremental maintenance method for temporal views that allows improvements over the re-computation from scratch. We introduce a formalism for temporal data warehouse specification that summarizes i...
Query Processing with Materialized Views in a Traceable P2P Record Exchange Framework
Materialized views which are derived from base relations and stored in the database are often used to speed up query processing. In this paper, we leverage them in a traceable peer-to-peer (P2P) record exchange framework which was proposed to ensure reliability among the exchanged data in P2P networks where duplicates and modifications of data occur independently in autonomous peers. In our pro...